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For given computational resources, the accuracy of plasma simulations using particles is mainly 
held back by the noise due to limited statistical sampling in the reconstruction of the particle dis- 
tribution function. A method based on wavelet analysis is proposed and tested to reduce this noise. 
The method, known as wavelet based density estimation (WBDE), was previously introduced in 
the statistical literature to estimate probability densities given a finite number of independent mea- 
surements. Its novel application to plasma simulations can be viewed as a natural extension of the 
finite size particles (FSP) approach, with the advantage of estimating more accurately distribution 
functions that have localized sharp features. The proposed method preserves the moments of the 
particle distribution function to a good level of accuracy, has no constraints on the dimensionality of 
the system, does not require an a priori selection of a global smoothing scale, and its able to adapt 
locally to the smoothness of the density based on the given discrete particle data. Most importantly, 
the computational cost of the denoising stage is of the same order as one time step of a FSP simu- 
lation. The method is compared with a recently proposed proper orthogonal decomposition based 
method, and it is tested with three particle data sets that involve different levels of coUisionality 
and interaction with external and self-consistent fields. 



I. INTRODUCTION 

Particle-based numerical methods are routinely used 
in plasma physics calculations [HE]. In many cases 
these methods are more efficient and simpler to imple- 
ment than the corresponding continuum Eulerian meth- 
ods. However, particle methods face the well known sta- 
tistical sampling limitation of attempting to simulate a 
physical system containing iV particles using Np <^ N 
computational particles. Particle methods do not seek 
to reproduce the exact individual behavior of the parti- 
cles, but rather to approximate statistical macroscopic 
quantities like density, current, and temperature. These 
quantities are determined from the particle distribution 
function. Therefore, a problem of relevance for the suc- 
cess of particle-based simulations is the reconstruction of 
the particle distribution function from discrete particle 
data. 

The difference between the distribution function re- 
constructed from a simulation using Np particles and 
the exact distribution function gives rise to a discretiza- 
tion error generically known as "particle noise" due to 
its random-like character. Understanding and reducing 
this error is a complex problem of importance in the val- 



idation and verification of particle codes, see for example 
Refs. [31 131 |S] and references therein for a discussion in 
the context of gyrokinetic calculations. One obvious way 
to reduce particle noise is by increasing the number of 
computational particles. However, the unfavorable scal- 
ing of the error with the number of particles, ^ 1/ ^jNp 
13 [Z] J puts a severe limitation on this straightforward ap- 
proach. This has motivated the development of various 
noise reduction techniques including finite size particles 
(FSP) j8,.S,, Monte-Carlo methods [7^, Fourier-filtering 
[TU] . coarse-graining [TT], Krook operators [5], smooth 
interpolation [T?, low noise collision operators [13], and 
Proper Orthogonal Decomposition (POD) methods [2] 
among others. 

In the present paper we propose a wavelet-based 
method for noise reduction in the reconstruction of par- 
ticle distribution functions from particle simulation data. 
The method, known as Wavelet Based Density Estima- 
tion (WBDE), was originally introduced in Ref. [15^ in 
the context of statistics to estimate probability densi- 
ties given a finite number of independent measurements. 
However, to our knowledge, this method has not been 
applied before to particle-base computations. WBDE, as 
used here, is based on truncations of the wavelet repre- 
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sentation of the Dirac delta function associated with each 
particle. The method yields almost optimal results for 
functions with unknown local smoothness without com- 
promising computational efficiency, assuming that the 
particles' coordinates are statistically independent. As 
a first step in the application of the WBDE method to 
plasma particle simulations, we limit attention to "pas- 
sive denoising". That is the WBDE method is treated 
as a post-processing technique applied to independently 
generated particle data. The problem of "active denois- 
ing" , e.g. the application of WBDE methods in the eval- 
uation of self-consistent fields in particle in cell simula- 
tions, will not be addressed. This simplification will allow 
us to assess the efficiency of the proposed noise reduction 
method in a simple setting. Another simplification per- 
tains the dimensionality. Here, for the sake of simplicity, 
we limit attention to the reconstruction and denoising 
problem in two dimensions. However, the extension of 
the WBDE method to higher dimensions is in principle 
straightforward . 

Collisions, or the absence of them, play an important 
role in plasma transport problems. Particle methods han- 
dle the collisional and non-collisional parts of the dynam- 
ics differently. Fokker-Planck-type collision operators arc 
typically introduced in particle methods using Langevin- 
type stochastic differential equations. On the other hand, 
the non-collisional part of the dynamics is described using 
deterministic ordinary differential equations. Collisional 
dominated problems tend to washout small scale struc- 
tures whereas coUisionless problems typically develop fine 
scale filamentary structures in phase space. Therefore, it 
is important to test the dependence of the efficiency of 
denoising reconstruction methods on the level of coUision- 
ality. Here we test the WBDE method in strongly col- 
lisional, weakly collisional and coUisionless regimes. For 
the strongly collisional regime we consider particle data 
of force-free collisional relaxation involving energy and 
pinch-angle scattering. The weakly collisional regime is 
illustrated using guiding-center particle data of a mag- 
netically confined plasma in toroidal geometry. The col- 
lisionless regime is studied using particle in cell (PIC) 
data corresponding to bump-on-tail and two streams in- 
stabilities in the Vlasov-Poisson system. 

Beyond the role of collisions, the data sets that we are 
considering open the possibility of exploring the role of 
external and self-consistent fields in the reconstruction of 
the particle density. In the collisional relaxation problem 
no forces act on the particles, in the guiding-center prob- 
lem particles interact with an external magnetic field, 
and in the Vlasov-Poisson problem particle interactions 
are incorporated through a self-consistent electrostatic 
mean field. One of the goals of this paper is to compare 
the WBDE method with the Proper Orthogonal Decom- 
position (POD) density reconstruction method proposed 
in Ref. [14j. 

The rest of the paper is organized as follows. In Sect. II 
we review the main properties of kernel density estima- 
tion (KDE) and show its relationship with finite size par- 



ticles (FSP). We then review basic notions on orthogo- 
nal wavelet and multiresolution analysis and outline a 
step by step algorithm for WBDE. Also, for complete- 
ness, in this section we include a brief description of 
the POD reconstruction method proposed in Ref. [14J. 
Section III discusses applications of the WBDE method 
and the comparison with the POD method. We start 
by post-processing a simulation of plasma relaxation by 
random collisions against a background thermostat. We 
then turn to a 6f Monte-Carlo simulation in toroidal 
geometry, whose phase space has been reduced to two 
dimensions. Finally, we analyze the results of particle- 
in-cell (PIC) simulations of a ID Vlasov-Poisson plasma. 
The conclusions are presented in Sec. IV. 



II. METHODS 

This section presents the wavelet-based density esti- 
mation (WBDE) algorithm. We start by reviewing basic 
ideas on kernel density estimation (KDE) which is closely 
related to the use of finite size particles (FSP) in PIC sim- 
ulations. Following this, we we give a brief introduction 
to wavelet analysis and discuss the WBDE algorithm. 
For completeness, we also include a brief summary of the 
POD approach. 



A. Kernel density estimation 

Given a sequence of independent and identically dis- 
tributed measurements, the nonparametric density esti- 
mation problem consists in finding the underlying prob- 
ability density function (PDF), with no a priori assump- 
tions on its functional form. Here we discuss general ideas 
on this difficult problem for which a variety of statisti- 
cal methods have been developed. Further details can be 
found in the statistics literature, e.g. Ref. [T5] . 

Consider a number Np of statistically independent 
particles with phase space coordinates {^n)i<n<Np dis- 
tributed in M'^ according to a PDF /. This data can 
come from a PIC or a Monte-Carlo, full / or Sf simula- 
tion. Formally, the sample PDF can be written as 

P n=l 

where d is the Dirac distribution. Because of its lack of 
smoothness, Eq. ([T]) is far from the actual distribution 
/ according to most reasonable definitions of the error. 
Moreover, the dependence of on the statistical fluc- 
tuations in (X„) can lead to an artificial increase of the 
coUisionality of the plasma. 

The simplest method to introduce some smoothness in 
f is to use a histogram. Consider a tiling of the phase 
space by a Cartesian grid with cells. Let {-BaIasa 
denote the set of all cells with characteristic function xx 
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defined as xa = 1 if 2; G and xa = otherwise. Then 
the histogram corresponding to the tiling is 



Xa(x) 



(2) 



which can also be viewed as the orthogonal projection of 
on the space spanned by the x\- The main difference 
between and is that the latter cannot vary at scales 
finer than the grid scale which is of order Ng'^. By choos- 
ing Ng small enough, it is therefore possible to reduce the 
variance of to very low levels, but the estimate then 
becomes more and more biased towards a piecewise con- 
tinuous function, which is not smooth enough to be the 
true density. Histograms correspond to the nearest grid 
point (NGP) charge assignment scheme used in the early 
days of plasma physics computations [8] . 

One of the most popular methods to achieve higher 
level of smoothness is kernel density estimation (KDE) 
[17] . Given (X.n)i<n<Np, the kernel estimate of / is de- 
fined as 



J \ J AT 



n— 1 



(3) 



where the smoothing kernel K is a, positive definite, nor- 
malized, J K ^ 1, function. Equation (l3|) corresponds 
to the convolution of K with the Dirac aelta measure 
corresponding to each particle. A typical example is the 
Gaussian kernel 
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where the so-called "bandwidth" , or smoothing scale, h, 
is a free parameter. The optimal smoothing scale de- 
pends on how the error is measured. For example, in 
the one dimensional case, to minimize the mean L^-error 

between the estimate and the true density, the smooth- 

. _i 
ing volume h should scale like Np = , and the resulting 

— - 

error scales like Np ^ [TB]. As in the case of histograms, 
the choice of h relies on a trade-off between variance and 
bias. In the context of plasma physics simulations the 
kernel K corresponds to the charge assignment function 

A significant effort has been devoted in the choice of 
the function K since it has a strong impact on computa- 
tional efficiency and on the conservation of global quanti- 
ties. Concerning h, it has been shown that it should not 
be much larger than the Debye length A^i of the plasma 
to obtain a realistic and stable simulation Given a 
certain amount of computational resources, the general 
tendency has thus been to reduce h as far as possible 
in order to fit more Debye lengths inside the simulation 
domain, which means that the effort has been concen- 
trated on reducing the bias term in the error. Since 
the force fields depend on / through integral equations, 
like the Poisson equation, that tend to reduce the high 



wavenumber noise, we do not expect the disastrous scal- 
ing h oc Np~'^, which would mean Np a in d di- 
mensions, to hold. Nevertheless, the problem remains 
that if we want to preserve high resolution features of 
/ or of the electromagnetic fields, we need to reduce h, 
and therefore greatly increase the number of particles to 
prevent the simulation from drowning into noise. Band- 
width selection has long been recognized as the central 
issue in kernel density estimation pJJ . We are not aware 
of a theoretical or numerical prediction of the optimal 
value of h taking into account the noise term. To bypass 
this difficulty, it is possible to use new statistical meth- 
ods which do not force us to choose a global smoothing 
parameter. Instead, they adapt locally to the behavior 
of the density / based on the available data. Wavelet 
based-density estimation, which we will introduce in the 
next two sections, is one of these methods. 



B. Bases of orthogonal wavelets 

Wavelets are a standard mathematical tool to analyze 
and compute non stationary signals. Here we recall basic 
concepts and definitions. Further details can be found 
in Ref. |19j and references therein. The construction 
takes place in the Hilbert space L^(R) of square inte- 
grable functions. An orthonormal family ('ipj^i(x))j^iq,i<£Z 
is called a wavelet family when its members are dilations 
and translations of a fixed function ■0 called the mother 
wavelet: 



(5) 



where j indexes the scale of the wavelets and i their po- 
sitions, and -0 satisfies J ijj = 0. In the following we shall 
always assume that ijj has compact support of length S. 
The coefficients (/ | tpj,i) = / /V'j.i of a function / for 
this family are denoted by (fj.i)- These coefficients de- 
scribe the fiuctuations of / at scale 2"-' around position 
^. Large values of j correspond to fine scales, and small 
values to coarse scales. Some members of the commonly 
used Daubechies 6 wavelet family are shown in the left 
panel of Fig. 1. 

It can be shown that the orthogonal complement in 
L^(IR) of the linear space spanned by the wavelets is itself 
orthogonally spanned by the translates of a function ip, 
called the scaling function. Defining 



ipL.i = 2^ip{2^x - i) 



(6) 



and the scaling coefficients Jl,- 
the reconstruction formula: 



(/ I fL.i); one thus has 



00 00 



(7) 



i— — oo j—L i— — 00 

The first sum on the right hand side of Eq. Q is a smooth 
approximation of / at the coarse scale, 2^ , and the sec- 
ond sum corresponds to the addition of details at succes- 
sively finer scales. 
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If the wavelet -0 has M vanishing moments: 



x"'i^{x)dx = 



(8) 



for Q < m < AI , and if / is locally m times contin- 
uously differentiable around some point x^, then a key 
property of the wavelet expansion is that the coefficients 
located near xq decay when j ^ oo like 2~-'(™+2) pU] , 
Hence, localized singularities or sharp features in / affect 
only a finite number of wavelet coefficients within each 
scale. Another important consequence of ([8| of special 
relevance to particle methods is that for < m < M, the 
moments J x"^ f(x)dx of the particle distribution func- 
tion depend only on its scaling coefficients, and not on 
its wavelet coefficients. 

If the scaling coefficients fj^ at a certain scale J 
are known, all the wavelet coefficients at coarser scales 
(j < J) can be computed using the fast wavelet trans- 
form (FWT) algorithm 21J. We shall address the issue of 
computing the scaling coefficients themselves in section 

m\ 

The generalization to d dimensions involves tensor 
products of wavelets and scaling functions at the same 
scale. For example, given a wavelet basis on K, a wavelet 
basis on can be constructed in the following way: 

V'k,., (2^1,2^2) = 2X2^X1 - *i)^(2-'x2 - ia) (9) 

= 2^Vi2'xi-l,MVx2-i2) (10) 

i^h^rM^^^) = 2X2^X1 - zi)0(2^a;2 - *2) , (11) 

where we refer to the exponent fi = 1,2,3 as the di- 
rection of the wavelets. This name is easily understood 
by looking at different wavelets shown in Fig. [l] (right). 
The corresponding scaling functions are simply given by 
2^(p{2^xi — ii)(p{2^X2 — 12)- Wavelets on M.'^ are con- 
structed exactly in the same way, but this time using 
1^ — \ directions. To lighten the notation we write the 
d-dimensional analog of Eq. ([t]) as 



(12) 



where A = (j, i, /i) is a multi-index, with the integer j 
denoting the scale and the integer vector i = (ii, 12, . . .) 
denoting the position of the wavelet. 

The wavelet multiresolution reconstruction formula in 
Eq. ([t]) involves an infinite sum over the position index i. 
One way of dealing with this sum is to determine a priori 
the non-zero coefficients in Eq. ([t]), and work only with 
these coefficients, but still retaining the full wavelet basis 
on as presented above. Another alternative, which we 
have chosen because it is easier to implement, is to peri- 
odize the wavelet transform on a bounded domain |21j . 
Assuming that the coordinates have been rescaled so that 
all the particles lie in [0, 1]'', we replace the wavelets and 



scaling functions by their periodized counterparts: 

00 

V'j,i(a;) -> E "0^,1(2; + (13) 

00 

"PiA^) E + (14) 



Throughout this paper we will consider only periodic 
wavelets. For the sake of completeness we mention a 
third alternative which is technically more complicated. 
It consists in constructing a wavelet basis on a bounded 
interval The advantage of this approach is that it 

does not introduce artificially large wavelet coefficients 
at the boundaries for functions / that are not periodic. 



C. Wavelet based density estimation 

The multiscale nature of wavelets allows them to adapt 
locally to the smoothness of the analyzed function |21j . 
This fundamental property has triggered their use in a 
variety of problems. One of their most fruitful applica- 
tions has been the denoising of intermittent signals [23] . 
The practical success of wavelet thresholding to reduce 
noise relies on the observation that the expansion of sig- 
nals in a wavelet basis is typically sparse. Sparsity means 
that the interesting features of the signal are well summa- 
rized by a small fraction of large wavelet coefficients. On 
the contrary, the variance of the noise is spread over all 
the coefficients appearing in Eq. (12 1. Although the few 



large coefficients are of course also affected by noise, cur- 
ing the noise in the small coefficients is already a very 
good improvement. The original setting of this tech- 
nique, hereafter referred to as global wavelet shrinkage, 
requires the noise to be additive, stationary, Gaussian 
and white. It found a first application in plasma physics 
in Ref . [2T , where coherent bursts were extracted out of 
plasma density signals. Since Ref. [23i, wavelet denois- 
ing has been extended to a number of more general sit- 
uations, like non-Gaussian or correlated additive noise, 
or to denoise the spectra of locally stationary time se- 
ries In particular, the same ideas were developed 
in Ref. [151 EB] to propose a wavelet-based density esti- 
mation (WBDE) method based on independent observa- 
tions. At this point we would like to stress that WBDE 
assumes nothing about the Gaussianity of the noise or 
whether or not it is stationary. In fact, under the inde- 
pendence hypothesis - which is admittedly quite strong 
- the statistical properties of the noise are entirely de- 
termined by standard probability theory. We refer to 
Ref. J?7] for a review on the applications of wavelets in 
statistics. In Ref. |2.8^, global wavelet shrinkage was ap- 
plied directly to the charge density of a 2D PIC code, in a 
case were the statistical fluctuations were quasi Gaussian 
and stationary. In particular, an iterative algorithm [29], 
which crucially relies on the stationnarity hypothesis, was 
used to determine the level of fluctuations. However, in 
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the next section we will show an example where the noise 
is clearly non-stationary, and this procedure fails. 

Let us now describe the WBDE method as we have 
generalized it to several dimensions. The first step is to 
expand the sample particle distribution function, f^, in 
Eq. ([T| in a wavelet basis according to Eq. (12) with the 
wavelet coefficients 

/a - (/M^a)-^E^>™ (15) 



P n=l 



(16) 



Since this reconstruction is exact, keeping all the wavelet 
coefficients does not improve the smoothness of The 
simple and yet efficient remedy consists in keeping only a 
subset of the wavelet coefficients in Eq. ( [T2| . A straight- 
forward prescription would be to discard all the wavelet 
coefficients at scales finer than a cut-off scale L. This ap- 
proach corresponds to a generalization of the histogram 
method in Eq. ^ with Ng — 2^. Because the char- 
acteristic functions x\ of cells in a dyadic grid are 
the scaling functions associated with the Haar wavelet 
family, Eqs. ([l2| and ^ are in fact equivalent for this 
wavelet family. Accordingly, like in the histogram case, 
we would have to choose L quite low to obtain a sta- 
ble estimate, at the risk of losing some sharp features 
of /. Better results can be obtained by keeping some 
wavelet coefficients down to a much finer scale J > L. 
However, to prevent that statistical fluctuations contam- 
inate the estimate, only those coefficients whose modu- 
lus are above a certain threshold should be kept. We 
are thus naturally led to a nonlinear thresholding proce- 
dure. In the one dimensional case, values of J, L, and 
of the threshold within each scale that yield theoretically 
optimal results have been given in Ref. |15j . This ref- 
erence discusses the precise smoothness requirements on 
/, which can accommodate well localized singularities, 
like shocks and filamentary structures known to arise in 
coUisionless plasma simulations. There remains the ques- 
tion of how to compute the fj^i based on the positions of 
the particles. Although more accurate methods based on 
( 15 1 may be developed in the future, our present approx- 



imation relies on the computation of a histogram, which 
creates errors of order N~^. The complete procedure is 
described in the following Wavelet- based density es- 
timation algorithm: 



1. construct a histogram of the particle data with 
Ng — cells in each direction, 

2. approximate the scaling coefficients at the finest 
scale Jg by : 



fj 



(17) 



3. compute all the needed wavelet coefficients using 
the FWT algorithm. 



4. keep all the coefficients for scales coarser than L, 

defined by 2"^^ ~ Np^^''° where is the order of 
regularity of the wavelet (1 in our case), 

5. discard all the coefficients for scales strictly finer 

-)dj „, 



than J defined by 2 



los, Af„ ' 



6. for scales j in between L and J, keep only the 
wavelet coefficients f\ such that |/a| > Tj = 

C Jj- where C is a constant that must in prin- 
ciple depend on the smoothness of / and on the 
wavelet family [TS]. 

In the following, except otherwise indicated, C — ^. 

For the wavelet bases we used orthonormal Daubechies 

wavelets with 6 vanishing moments and thus support of 

size S" = 12 |30j . In our case, — \, which means 

that the wavelets have a first derivative but no second 

derivative, and the size of the wavelets at scale L for 
_ 1 

d = 1 is roughly Np ^ . Since Np 1 , it follows from the 
definition at stage 5 of the algorithm that the size of the 
wavelets at scale J is orders of magnitude smaller than 
that. Using the adaptive properties of wavelets, we are 
thus able to detect small scale structures of / without 
compromising the stability of the estimate. Note that 
the error at stage |2] could be reduced by using Coiflets 
PI] instead of Daubechies wavelets, but the gain would 
be negligible compared to the error made at stage [l] We 
will denote the WBDE estimate of / as . In the one- 
dimensional case, 



2^ 



(18) 



j=L i=l 



where pj is the thresholding function as defined by stage 
|6]of the algorithm : pj{y) = if |y| < Tj and Pj{y) = 1 
otherwise. 

Finally, let us propose two methods for applying 
WBDE to postprocess 5f simulations. Recall that the 
Lagrangian equations involved in the 5f schemes are 
identical to their full / counterparts. The only difficulty 
introduced by the 5f method lies in the evaluation of 
phase space integrals of the form 51 ^ j A ■ {f ~ fo), 
where A is a function on phase space and fo is a known 
reference distribution function. In these integrals, / — /o 
should be replaced by 6f, which is in turn written as 
a product wf, where w is a "weighting" function. Nu- 
merically, w is known via its values at particles posi- 
tions, w(X„), and the usual expression for SI is thus 

= En=iMXn)w{Xn). Wc cauuot apply WBDE di- 
rectly to Sf, since this function is not a density func- 
tion. An elegant approach would be to first apply WBDE 
to the unweighted distribution to determine the set 
of statistically significant wavelet coefficients, and to in- 
clude the weights only in the final reconstruction ( 18 1 
of f^. A simpler approach, which we will illustrate 
in section |IIIB[ consists in renormalizing Sf, so that 
/ = 1, and treat it like a density. 
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D. Further issues related to practical 
implementation 

In this section we discuss how the WBDE method han- 
dles two issues of direct relevance to plasma simulations: 
conservation of moments and computational efficiency. 
As mentioned before, due to the vanishing moments of 
the wavelets in Eq. ([s]) , the moments up to order M of the 
particle distribution distribution are solely determined 
by its scaling function coefficients. As a consequence, 
we expect the thresholding procedure to conserve these 
moments, in the sense that 



7''(x)dx = M 



s 



C/^(x)dx: ^ 

(19) 

for < m < Af - 1 and for all i G {!,..., d}. This 
conservation holds up to round-off error if the wavelet 
coefficients can be computed exactly. Due to the type of 
wavelets that we have used, we were not able to achieve 
this in the results presented here. There remains a small 
error related to stages 1 and 2 of the algorithm, namely 
the construction of and the approximation of the scal- 
ing function coefficients by Eq. (17 1. They are both of 



order Ng ^. We will present numerical examples of the 
moments of in the next section. 

Conservation of moments is closely related to a pe- 
culiarity of the denoised distribution function resulting 
from the WBDE algorithm: it is not necessarily every- 
where positive. Indeed, wavelets are oscillating functions 
by definition, and removing wavelet coefficients therefore 
cannot preserve positivity in general. Further studies 
are needed to assess if this creates numerical instabilities 
when is used in the computation of self-consistent 
fields. The same issue was discussed in Ref. where a 
kernel with two vanishing moments was used to linearly 
smooth the distribution function. The fact that this ker- 
nel is not everywhere positive was not considered harmful 
in this reference. We acknowledge that it may render the 
resampling of new particles from if it is needed in 
the future, more difficult. There are ways of forcing 
to be positive, for example by applying the method to 
y/ f and then taking the square of the resulting estimate, 
but this implies the loss of the moment conservation, and 
we have not pursued in this direction. 

The number of arithmetic operations to perform a fast 
wavelet transform from scale 2~"' to scale with the 
FWT in d dimensions is 232'^'''^ where S is the length 
of the wavelet filter (12 for the Daubechies filter that 
we are using). The definitions of J and L imply that 

2 

2d(J-L) gcales like , ^'l, . The cost of the binning stage 

log Np t> t> 

of order Np, so that the total cost for computing is 
0{Np), not larger than the cost of one time step dur- 
ing the simulation that produced the data. The amount 
of memory needed to store the wavelet coefficients dur- 
ing the denoising procedure is proportional to Ng, which 
should at least scale like 2'*"', and therefore also like N„. 
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FIG. 1: Daubechies 6 wavelet family. Left, bold red: scal- 
ing function tp at scale j = 5. Left, bold blue: wavelet -0 at 
scale j' = 5. Left, thin black, from left to right: wavelets 
at scales 6, 7, 8 and 9. Right : (a) 2D scaling function 
(p{x\)(p{x2) ■ (b) first 2D wavelet %p(x-i)tp(x2) ■ (c) second 2D 
wavelet ip{xi)^l)(x2)- (d) third 2D wavelet ^{x\)'4>{x2)- 



If one wishes to use a finer grid to ensure high accuracy 
conservation of moments, the storage requirements grow 
like Ng. Thanks to optimized in-place algorithms, the 
amount of additional memory needed during the compu- 
tation does not exceed 35*. Another consequence of using 
the FWT algorithm is that Ng must be an integer multi- 
ple of 2"^"^. For comparison purposes, let us recall that 
most algorithms to compute the POD have a complexity 
proportional to Ng when d—2. 

To conclude this subsection, Fig.|2]presents an example 
of the reconstruction of a ID discontinuous density that 
illustrates the difference between the KDE and WBDE 
methods. The probability density function is uniform 
on the interval [5, |] and the estimates were computed 
on [0, 1] to include the discontinuities. The sample size 
was 2^*, and the binning used Ng = 2^^ cells to com- 
pute the scaling function coefficients. For this ID case 
the value C = 2 was used to determine the thresh- 
olds (step [g] of the algorithm). The KDE estimate is 
computed using a Gaussian kernel with smoothing scale 
h — 0.0138 ^33j. The relative mean squared errors asso- 
ciated with the KDE and WBDE estimates are respec- 
tively 19.6 X 10-3 and 6.97 x lO'^. The error in the 
KDE estimate comes mostly from the smoothing of the 
discontinuities. The better performance of WBDE stems 
from the much sharper representation of these disconti- 
nuities. It is also observed that the WBDE estimate is 
not everywhere positive. The approximate conservation 
of moments is demonstrated on Table H] Note that the 
error on all these moments for could be made arbi- 
trary low by increasing Ng. The overshoots could also be 
mitigated by using nearly shift invariant wavelets |34j . 



E. Proper Orthogonal Decomposition Method 

For completeness, in this subsection we present a brief 
review of the POD density reconstruction method. For 
the sake of comparison with the WBDE method, we limit 
attention to the time independent case. Further details, 
including the reconstruction of time dependent densities 
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m = m = 1 m = 2 m — A 
1.81 • 10"^ 1.70 • 10"^ 7.52 • 10"" 3.90 • 10"^ 
f^' 1.08 • 10"" 1.52 • 10"^ 2.93 • 10"^ 5.52 • lO"'^ 

TABLE I: Relative absolute difference between the moments 
of and those of and for the distribution function 
corresponding to Fig. [2] 



FIG. 2: Estimation of the density of a sample of size 2^** 
drawn uniformly in [1/3,2/3], using Gaussian kernels (left) 
or wavelets (right). The discontinuous analytical density is 
plotted with a dashed line in the two cases. 



using POD methods can be found in Ref . [T3] . 

The first step in the POD method is to construct the 
histogram from the particle data. This density is 
represented by an Nx x Ny matrix fij containing the 
fraction of particles with coordinates {x^y) such that 
Xi < X < Xi^i and Yi < y < Yi^i. In two dimen- 
sions, the POD method is based on the singular value 
decomposition of the histogram. According to the SVD 
theorem the matrix / can always be factorized as 
/ = UWV^ where U and V are N,^ x and Ny x Ny 
orthogonal matrices, UU^ — VV* — I, and is a 
diagonal matrix, W = dia,g (wi,W2, ■■ - Wn), such that 
wi > W2 > . . . > wn > 0. with N = min(iVa;, Ny). 

In vector form, the decomposition can be expressed as 



N 



i (fe) (fc) 



k=l 



where the A^.r-dimensional vectors. 



,(fe) 



(20) 



and the N,, 



(k) 

dimensional vectors, Vj , are the orthonormal POD 
modes and correspond to the columns of the matrices U 



and V respectively. Given the decomposition in Eq. ( 20 1 
we define the rank-r approximation of / as 



(r) 



E(fc) (fc) 



(21) 



where 1 < r < A^, and define the corresponding rank-r 
reconstruction error as 



N 



3M = ||/-./('-)|p = 



(22) 



where 



— A/y^io A^, is the Frobenius norm. Since 



j(r=N) _ define e{N) — 0. The key property of 

the POD is that the approximation in Eq. (21 1 is optimal 
in the sense that 



e(r) = min |||/ — |rank((7)=r| 



(23) 



That is, of all the possible rank-r Cartesian product ap- 
proximations of /, f^''^ is the closest to / in the Frobenius 



The SVD spectrum, {wk], of noise free coherent sig- 
nals decays very rapidly after a few modes, but the spec- 
trum of noise dominated signals is relatively flat and de- 
cays very slowly. When a coherent signal is contaminated 
with low level noise, the SVD spectrum exhibits an ini- 
tial rapid decay followed by a weakly decaying spectrum 
known as the noisy plateau. In the POD method the 
denoised density is defined as the truncation = f''^''\ 
where Tc corresponds to the rank where the noisy plateau 
starts. In general it is difficult to provide a precise a priori 
estimate of r^, and this is one of the potential limitations 
of the POD method. One possible quantitative criterion 
used in Ref. [14 is to consider the relative decay of the 
spectrum, A(A;) — [wk+i — Wk)/{w2 — wi), for fc > 1, 
and define Vc by the condition A(rc) < Ac where Ac is a 
predetermined threshold. 



III. APPLICATIONS 

In this section, we apply the WBDE method to re- 
construct and denoise the particle distribution function 
starting from discrete particle data. The data corre- 
sponds to three different groups of simulations: coUi- 
sional thermalization with a background plasma, guid- 
ing center transport in toroidal geometry, and Vlasov- 
Poisson electrostatic instabilities. The first two groups 
of simulations were analyzed using POD methods in 
Ref. [14]. One of the goals of this section is to com- 
pare the POD method with the WBDE method in these 
cases and in a new Vlasov-Poisson data set. This data set 
allows the testing of the reconstruction algorithms in a 
coUisionless system that incorporates the self-consistent 
evaluation of the forces acting on the particles, as op- 
posed to the coUisional, test particle problems analyzed 
before. When comparing the two methods it is impor- 
tant to keep in mind that POD has one free parameter, 
namely the number r of singular vectors that are retained 
to reconstruct the denoised distribution function. In the 
cases studied here we used a best guess for r based on 
the properties of the reconstruction. In Ref. [U] the POD 
method was developed and applied to time independent 
and time dependent data sets. However, in the com- 
parison with the WBDE method, we limit attention to 
2-dimensional time independent data sets. 

The accuracy of the reconstruction of the density at a 
fixed time t will be monitored using the absolute mean 
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square error 



E 

hi 



(24) 



where {xi, yj) are the coordinates of the nodes of a pre- 
scribed Ng X Ng grid in the space, and J'^** denotes the 
estimated density computed from a sample with Np par- 
ticles. For the WBDE method p^'^ = f^, and for the 
POD method Z*^"* — . In principle, the reference den- 
sity, f^*^-^, in Eq. (24) should be the density function 



obtained from the exact solution of the corresponding 
continuum model, e.g. the Fokker-Planck or the Vlasov- 
Poisson system. However, when no explicit solution is 
available, we will set f^'^^ = where is the his- 
togram corresponding to a simulation with a maximum 
number of particles available which in the cases reported 
here correspond to Np = 10^. We will also use the nor- 
malized error 



eo 



(25) 



A. Collisional thermalization with a background 
plasma 

This first example models the relaxation of a non equi- 
librium plasma by collisional damping and pitch angle 
scattering on a thermal background. The plasma is spa- 
tially homogeneous and is represented by an ensemble of 
Np particles in a three-dimensional velocity space. As- 
suming a strong magnetic field, the dynamics can be re- 
duced to two degrees of freedom: the magnitude of the 
particle velocity, v, and the particle pitch, A — cos 6, 
where 6 is the angle between the particle velocity and 
the magnetic field. In the continuum limit the particle 
distribution function is governed by the Fokker-Planck 
equation which in the particle description corresponds to 
the stochastic differential equations 



dX 



(26) 



dv ■ 



dt+^v^ v\\driy, (27) 



describing the evolution of w e (0, oo) and A e [—1, 1] for 
each particle, where drj\ and d-q^ are independent Wiener 
stochastic processes and vo, Vg and v\\ are functions of 
V. For further details on the model see Ref. [H] and 
references therein. 

We considered simulations with Np — lO'^, 10**, 10^ 
and 10® particles. The initial conditions of the ensemble 
of particles were obtained by sampling a distribution of 
the form 



/(w, A, i = 0) = Cw^ cxp 



(A-Ao)2 , {v-v,f 




■t = 22 
■t = 44 
■t = 72 




FIG. 3: Wavelet and POD analyses of collisional relaxation 
particle data at different fixed times, with A'p = 10^. Top left: 
absolute values of the wavelet coefficients sorted by decreasing 
order (full lines), and thresholds given by the Waveshrink 
algorithm (dashed lines). Top right: singular values of the 
histogram used to construct . Bottom left: error estimate 

1/2 a 

with respect to the run for Np — 10 as a function of 

the number of retained wavelet coefficients (full lines), error 
obtained when using the Waveshrink threshold (dashed lines), 
and error obtained using the WBDE method (dash-dotted 
lines). Bottom right: error estimate for as a function 
of the number I of retained singular values. 



where a v'^ factor has been included in the definition of 
the initial condition so that the volume element is simply 
dwd/i, C is a normalization constant, Aq = 0.25, vo = 5, 
a\ = 0.25 and (t„ = 0.75. This relatively simple problem 
is particularly well suited for the WBDE method because 
the simulated particles do not interact and therefore sta- 
tistical correlations can not build-up between them. 

Before applying the WBDE method, we analyze the 
sparsity of the wavelet expansion of and compare the 
number of modes kept and the reconstruction error for 
different thresholding rules. The plot in the upper left 
panel of Fig. [3] shows the absolute values of the wavelet 
coefficients in decreasing order at different fixed times. 
The wavelet coefficients exhibit a clear rapid decay be- 
yond the few significant modes corresponding to the gross 
shape of the Maxwellian distribution. A similar trend is 
observed in the coefficients of the POD expansion shown 
in the upper right panel of Fig.[3j However, in the wavelet 
case the exponential decay starts after more than 100 
modes, whereas in the POD case the exponential decay 
starts after only one mode. 

The two panels at the bottom of Fig. [3] show the 
square root of the reconstruction error normalized by 



(28) 



vg, yz./Ng, in the WBDE and POD methods. Because 
in this case we do not have access to the exact solu- 
tion of the corresponding Fokker-Planck equation at the 
prescribed time, we used computed using Np = 10® 
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FIG. 4: Contour-plots of estimates of / for the coUisional 
relaxation particle data. First row: Histogram method es- 
timated using Np = 10^ particles. Second row: Histogram 
method estimated using Np = lO'^ particles. Third row: POD 
method estimated using Np — 10^ particles. Fourth row: 
WBDE method estimated using A'j, = 10^ particles. The 
three columns correspond to t = 28, t = 44 and t — 72 re- 
spectively. The plots show twenty isolines, equally spaced in 
the interval [0,0.4]. 



particles as the reference density f^"^^ in Eq. (|24|). The 



error observed when applying a global threshold to the 
wavelet coefficients (bottom left panel in Fig. |3]) is min- 
imal when around 100 modes are kept whereas in the 
POD case (bottom right panel in Fig. [s]) the minimal 
error is reached with about two or three modes. Fig. [3] 
also shows the wavelet threshold obtained by applying 
the iterative algorithm based on the stationary Gaussian 
white noise hypothesis [55] • The error corresponding 
to this threshold is larger than the optimal error because 
the noise in this problem is very non-stationary due to 
the lack of statistical fluctuations in the regions were par- 
ticles are absent. In contrast, the error corresponding 
to the WBDE procedure (dash-dotted line) is typically 
smaller than the optimal error obtained by global thresh- 
olding. This is not a contradiction, because the WBDE 
procedure is not a global threshold, but a level depen- 
dent threshold. 

Figure |4] compares at different times the densities es- 
timated with the WBDE and the POD (retaining only 
three modes) methods using Np = 10^ particles with the 
histograms computed using Np = 10^ and 10^ particles. 



t = 28 t = 44 t = 72 

f" 0.14 0.17 0.12 

0.068 0.090 0.094 

f^' 0.064 0.094 0.088 

TABLE II: Normalized root mean squared error eo < |25[ ) for 
the histogram, POD and WBDE estimates of the particle 
distribution function for A'^p = 10^ at three different times of 
the Maxwellian relaxation problem. 



The key future to observe is that the level of smoothness 
of and corresponding to Np = 10^ is similar, if 
not greater, than the level of smoothness in computed 
using ten times more particles, i.e. Np — 10^ particles. 
Table ITTlsummarizes the normalized reconstruction errors 
for Np = 10^ according Eq. (pij) using with Np = 10^ 
as f^'^f . The WBDE and POD denoising methods offer 
a significant improvement, approximately by a factor 2, 
over the raw histogram method. 

A more detailed comparison of the estimates can be 
achieved by focusing on the Maxwellian final equilibrium 
state 



/m(w) 



(29) 



where, as in Eq. (28 1, the v metric factor has been in- 



cluded in the definition of the distribution. For this cal- 
culations we considered sets of particles sampled from 
Eq. (29) in the compact domain [—1,1] x [0,4]. Since 
/if is an exact equilibrium solution of the Fokker-Plack 
equation, the ensemble of particles will be in statistical 
equilibrium but it will exhibit fluctuations due to the fi- 
nite number of particles. Figure [5] shows the dependence 
of the square root of the reconstruction error, e (normal- 
ized by Ng) on the number of particles Np and the grid 
resolution Ng for the WBDE and POD methods. The 
main advantage of this example is that the exact den- 
sity can be used as the reference density /'"'^•^ in the 
evaluation of the error. 



B. CoUisional guiding center transport in toroidal 
geometry 

The previous example focused on coUisional dynamics. 
However, in addition to collisions, plasma transport in- 
volves external and self-consistent electromagnetic fields 
and it is of interest to test the particle density recon- 
struction algorithms in these more complicated settings. 
As a first step on this challenging problem we consider 
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FIG. 5: Reconstruction error, 



JV2 I 



as a function of A'p for 



the collisional relaxation particle data corresponding to the 
Maxwellian equilibrium state. Bold solid lines correspond to 
the WBDE method, bold dashed lines correspond to the POD 
method, and thin dashed lines correspond to the histogram 
method. 



a plasma subject to collisions and an externally applied 
fixed magnetic field in toroidal geometry. The choice of 
the field geometry and structure was motivated by prob- 
lems of interest to magnetically confined fusion plasmas. 
The data was presented and analyzed using POD method 
in Ref. [14]. The phase space of the simulation is five 
dimensional. However, as in Ref. |14j . we limit atten- 
tion to the denoising of the particles distribution function 
along two coordinates corresponding to the poloidal angle 
9 £ [0, 2tt] and the cosine of the pitch angle /x € [—1,1]- 
The remaining three coordinates have been averaged out 
for the purpose of this study. The 9 coordinate is peri- 
odic, but the pitch coordinate /i is not. 

An important issue to consider is that the data was 
generated using a Sf code (DELTA5D). Based on an ex- 
pansion on p/L <C 1 (where p is the characteristic Larmor 
radius and L a typical equilibrium length scale) the dis- 
tribution function is decomposed into a Maxwellian part 
/m and a first-order perturbation 6f represented as a 
collection of particles (markers) 



(5/(x) = J2 WnSii^ - X„) 



(30) 



like in Eq. ([T]) except that each marker is assigned a time 
dependent weight Wn whose time evolution depends on 
the Maxwellian background [37] . The direct use of (5/(x) 
is problematic in the WBDE method because Sf is not 
a probability density. To circumvent this problem the 
WBDE method was applied after normalizing the Sf dis- 
tribution so that / \6f\" = 1, on a 128 x 128 grid. 

Figure [g] shows contour plots of the histogram cor- 
responding to A'p = 32 X 10^, 64 X 10^, and 1024 x 10^ 
along with the WBDE and POD reconstructed densities. 
The POD reconstructions were done using r — 3 modes, 
as in Ref. [14]. It is observed that comparatively high 
levels of smoothness can be achieved with considerably 
less particles by using either the WBDE or POD recon- 
struction methods. The WBDE method provides better 
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FIG. 6: Contour plots of estimates of / for the collisional 
guiding center transport particle data: Histogram method 
(first row), POD method (second row), and WBDE method 
(third row). The left, center and right columns correspond to 
N-p = 32-10^ (left), iVp = 128-10^ (middle) and N-p = 1024-10^ 
(right) respectively. The plots show seventeen isolines equally 
spaced within the interval [—0.5,0.5]. 



results for the J/ ~ contours. This is because POD 
modes are tensor product functions, that have difficulties 
in approximating the triangular shape of these contour 
lines. Note that the boundary artifacts due to periodiza- 
tion of the Daubechies wavelets do not seem to be very 
critical. The large wavelet coefficients associated with 
the discontinuity between the values of (5/ at /i = ±1 are 
not thresholded, so that the discontinuity is preserved in 
the denoised function. Figure [Tjcompares the reconstruc- 
tion errors in the WBDE, POD, and histogram methods 
as functions of the number of particles. To evaluate the 
error we used computed using TVp = 1024 x 10'^ as 
the reference density f^^S . As in the collisional trans- 
port problem, the error is reduced roughly by a factor 2 
for both methods compared to the raw histogram. Note 
that the scaling with iVp is slightly better for WBDE than 
for POD. 



Collisionless electrostatic instabilities 



In this section we apply the WBDE and POD meth- 
ods to reconstruct the single particle distribution func- 
tion from discrete particle data obtained from PIC sim- 
ulations of a Vlasov-Poisson plasma. We consider a one- 
dimensional, electrostatic, collisionless electron plasma 
with an ion neutralizing background in a finite size do- 
main with periodic boundary conditions. In the contin- 
uum limit the dynamics of the distribution function is 
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FIG. 7: Error estimate, ^jj^, for coUisional guiding center 

3 

transport particle data according to the histogram, the POD, 
and the wavelet methods. 



FIG. 8: Electrostatic energy as a function of time in the 
Vlasov-Poisson PIC simulations of the bump on tail instability 
for different numbers of particles. The straight lines denote 
the growth rate predicted by linear stability theory |35| . 



governed by the system of equations 

dtf + vdxf + dx(f>dyf = 
= C / f{x,v,t)dv - 1 , 



(31) 
(32) 



where the variables have been non-diniensionalized us- 
ing the Debye length as length scale and the plasma fre- 
quency as time scale, and L is the length of the system 
normalized with the Debye length. Following the stan- 
dard PIC methodology [Tl , we solve the Poisson equation 
on a grid and solve the particle equations using a leap- 
frog method. The reconstruction of the charge density 
uses a triangular shape function. We consider two initial 
conditions: the first one leads to a bump on tail instabil- 
ity, and the second one to a two streams instability. 



1. Bump on tail instability 

For the bump on tail instability we initialized ensem- 
bles of particles by sampling the distribution function 



2 l-2qv + 2v^ 



37rC 



(l + t;2)2 



(33) 



using a pseudo-random number generator. This equilib- 
rium is stable for g < 1 and unstable for g > 1. The 
dispersion relation and linear stability analysis for this 
equilibrium studied in Ref. [3F was used to benchmark 
the PIC code as shown in Fig. |8] In all the computa- 
tions presented here q = 1.25 and Np = 10"*, lO'^ and 
10^. The spatial domain size was set to C = 16.52 to fit 
the wavelength of the most unstable mode. 

Since the value of q is relatively close to the marginal 
value, the instability grows weakly and is concentrated 
in a narrow band in phase space centered around the 
point where the bump is located, f « 1 in this case. 
In order to unveil the nontrivial dynamics we focus the 



analysis in the band v G (—3,3), and plot the depar- 
ture of the particle distribution function from the initial 
background equilibrium. The POD method is applied 
directly to Sf" = f"{x,v,t) - fo{x,v), but the WBDE 
method is applied to the full f^{x,v,t), and fo{x,v) is 
subtracted only for visualization. Note that because we 
are considering only a subset of phase space, the effec- 
tive numbers of particles, Np — 7318, Np — 73143 and 
Np = 731472, are smaller than the nominal numbers of 
particles, Np = 10", Np = 10^ and Np = 10^ respectively. 

Figure |9] shows contour plots of 5f, for different num- 
ber of particles. Since the instability is seeded only by 
the random fiuctuations in the initial condition, increas- 
ing Np delays the onset of the linear stability and this 
leads to a phase shift of the nonlinear saturated regime. 
To aid the comparison of the saturated regime for differ- 
ent numbers of particles we have eliminated this phase 
shift by centering the peak of the particle distributions 
in the middle of the computational domain. A 256 x 256 
grid was used in the WBDE method, and a 50 x 50 grid 
was used for the histogram and the POD methods. The 
thresholds for the POD method where r = 1, r = 2, and 
r = 3 for iVp = 10"*, Np = 10^ and Np = 10^ respec- 
tively. Except for the case where Np = 10", both the 
POD and WBDE estimates are very smooth, in agree- 
ment with the expected behavior of / for this instability. 
It is observed that the level of smoothness of the his- 
togram estimated using 10^ particles is comparable to 
the level of smoothness achieved after denoising using 
only 10^ particles. One should mention that for scales 
between L and J occurring in the WBDE algorithm we 
find that none of the wavelet coefficients are above the 
thresholds at each scale. In fact, a simple KDE estimate 
with a large enough smoothing scale would probably do 
the job pretty well for this kind of instabilities which do 
not induce abrupt variations in /. Table |III| shows the 



POD and WBDE reconstruction errors for Np = lO" and 
10^. The error is computed using formula (25 1, 
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FIG. 9: Contour plots of estimates of Sf for the bump-on- 
tail instability PIC data at t = 149: Histogram method (first 
row), POD method (second row), and WBDE method (third 
row). The left, center and right columns correspond to A'p = 
10*, A''p = lO'^ and Np = 10^ particles respectively. The plots 
show thirteen contour lines equally spaced within the interval 
[-0.0120.012]. 
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FIG. 10: Relative error on the second order moment as a 
function of the grid resolution, Ng, in the POD, WBDE, and 
histogram methods for the bump on tail instability particle 
data at t = 149, with Np = lO" particles. 



Nr, 



lO"* Np = 10^5 



0.443 
0.163 
0.173 



0.140 
0.090 
0.086 



taking for fref the histogram obtained from the simula- 
tion with iVp = 10^. 

Figure [TO] shows the relative error on the second order 
moment : 



Mi ■ 



Mi, 



where M^, defined by ( 19 1. A similar quantity is also 
represented for and The time and number of 
particles are kept fixed a,t t = 149 and Np = 10^, and the 
grid resolution is varied. As expected, and con- 
serve the second order moment with accuracy 0{Ng'^). 
The errors corresponding to is of the same order of 
magnitude but seems to reach a plateau for Ng ~ 1024. 
This may be due to the fact that for Ng > 1024, there 
is less than one particle per cell of the histogram used to 
compute fp. 



2. Two-streams instability 

As a second example we consider the standard two- 
streams instability with an initial condition consisting of 
two counter-propagating cold electron beams initially lo- 
cated a,t V = —1 and v = 1. This case is conceptually 
different to the previous one because the initial condition 
depends trivially on the velocity. Therefore, there is no 
statistical error in the sampling of the distribution and 
the noise builds up only due to the self-consistent interac- 
tions between particles. In other words, there is initially 
a strong correlation between particles' coordinates, which 



TABLE III: Comparison of normalized root mean squared 
errors eo ( |25[ ) for the raw histogram and for the WBDE and 
POD methods, for the bump-on-tail instability a,t t — 149, 
depending on the number of particles. The simulation with 
A'p = 10^ is used as a reference to compute the error. 



will eventually almost vanish. This situation offers a way 
to test robustness of the WBDE method with respect to 
the underlying decorrelation hypothesis. 

The analysis is focused on four stages of the instabil- 



Grid sizes were Ng = 1024 for 
and Ng — 128 for the two others. 



ity corresponding to t = 40, 60, 100, and 400. Fig. 11 
shows a comparison of the raw histogram, the POD and 
the WBDE reconstructed particle distribution functions 
at these four instants 
the WBDE estimate 
For t = 40, no noise seems to have affected the particle 
distribution yet, therefore a perfect denoising procedure 
should conserve the full information about the particle 
positions. Although WBDE introduces some artifacts in 
regions of phase space that should contain no particles 
at all, it remarkably preserves the global structure of the 
two streams. This is possible thanks to the numerous 
wavelet coefficients close to the sharp features in / that 
are above the thresholds, in contrast to the bump-on-tail 
case. On the next snapshot at i = 60, the filaments have 
overlapped and the system is beginning to loose its mem- 
ory due to numerical round-off errors. The fastest fila- 
ments still visible on the histogram are not preserved by 
WBDE, but the most active regions are well reproduced. 
At < = 100, the closeness between the histogram and the 
WBDE estimate is striking. To put it somewhat subjec- 
tively, one may say that WBDE did not consider most of 
the rough features present at this stage as 'noise', since 
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they are not removed. Only with the last snapshot at 
t = 400 does the WBDE estimate begin to be smoother 
than the histogram, suggesting that the nonlinear inter- 
action between particles has introduced randomization in 
the system. 

The POD method is able to track very well the small 
and large scale structures of the particle density using a 
significantly smaller number of modes. In particular, for 
t = 40, 60, 100, and 400 only r = 28, r = 27, r = 18, and 
r = 5 modes were kept. The decrease of the number of 
modes with time is a result of the lost of fine scale features 
in the distribution function. Despite this, a limitation of 
the POD method is the lack of a thresholding algorithm 
to determine the optimal number of modes a priori. 



IV. SUMMARY AND CONCLUSION 

Wavelet based density estimation was investigated as 
a post-processing tool to reduce the noise in the recon- 
struction of particle distribution functions starting from 
discrete particle data. This is a problem of direct rele- 
vance to particle-based transport calculations in plasma 
physics and related fields. In particular, particle meth- 
ods present many advantages over continuum methods, 
but have the potential drawback of introducing noise due 
to statistical sampling. 

In the context of particle in cell methods this prob- 
lem is typically approached using finite size particles. 
However, this approach, which is closely related to the 
kernel density estimation method in statistics, requires 
the choice of a smoothing scale, h, (e.g., the standard 
deviation for Gaussian shape functions) whose optimal 
value is not known a priori. A small h is desirable to fit 
as many Debye wavelengths as possible, whereas a large 
h would lead to smoother distributions. This situation 
results from the compromise between bias and variance 
in statistical estimation. To address this problem we 
proposed a wavelet based density estimation (WBDE) 
method that does not require an a priori selection of a 
global smoothing scale and that its able to adapt locally 
to the smoothness of the density based on the given dis- 
crete data. The WBDE was introduced in statistics [15] . 
In this paper we extended the method to higher dimen- 
sion and apphed it for the first time to particle-based 
calculations. The resulting method exploits the multires- 
olution properties of wavelets, has very weak dependence 
on adjustable parameters, and relies mostly on the raw 
data to separate the relevant information from the noise. 

As a first example, we analyzed a plasma coUisional re- 
laxation problem modeled by stochastic differential equa- 
tions. Thanks to the sparsity of the wavelet expansion 
of the distribution function, we have been able to ex- 
tract the information out of the statistical fluctuations 
by nonlinear thresholding of the wavelet coefficients. At 
late times, when the particle distribution approaches a 
Maxwellian state, we have been able to quantify the dif- 
ference between the denoised particle distribution func- 



tion and its analytical counterpart, thus demonstrating 
the improvement with respect to the raw histogram. The 
POD-smoothed and wavelet-smoothed particle distribu- 
tion functions were shown to be roughly equivalent in this 
respect. These results were then extended to a more com- 
plex situation simulated with a Sf code. Finally, we have 
turned to the Vlasov-Poisson problem, which includes in- 
teractions between particles via the self-consistent elec- 
tric field. The POD and WBDE methods were shown 
to yield quantitatively close results in terms of mean 
squared error for a particle distribution function resulting 
from nonlinear saturation after occurrence of a bump-on- 
tail instability. We have then studied the denoising al- 
gorithm during nonlinear evolution after the two-streams 
instability starting from two counter-streaming cold elec- 
tron beams. This initial condition violates the decorre- 
lation hypothesis underlying the WBDE algorithm, and 
thus offers a good way to test its robustness regarding 
this aspect. The WBDE method was shown to yield 
qualitatively good results without changing the threshold 
values. 

One limitation of the present work comes from the way 
denoising quality is measured. We have considered the 
quadratic error on the distribution function / as a first 
indicator of the quality of our denoising methods. How- 
ever, it may be more relevant to compute the error on the 
force fields, which determine the evolution of the simu- 
lated plasma. These forces depend on / through inte- 
grals, and statistical analysis of the estimation of / using 
weak norms, like was done in [38j in the deterministic 
case, could therefore be of great help to obtain thresh- 
old parameters more efficient than those considered in 
this study. The computational cost of our method scales 
linearly with the number of particles and with the grid 
resolution. Therefore, WBDE is an excellent candidate 
to be performed at each time step during the course of a 
simulation. Once the wavelet expansion of the denoised 
particle distribution function is known, it is possible to 
continue using the wavelet representation to solve the 
Poisson equation [39] and to compute the forces. The mo- 
ment conservation properties that we have demonstrated 
in this paper should mitigate the unavoidable dissipative 
effects implied by the smoothing stage. In Ref. [S], a 
dissipative term was introduced in a global PIC code to 
avoid unlimited growth of particle weights in Sf codes, 
and this was shown to improve long time convergence of 
the simulations. It would be of interest to assess if the 
nonlinear dissipation operator corresponding to WBDE 
has the same effect. 
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